[X86] Re-enable DA8W4 path on X86 CPU #4033
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/4033
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 Unrelated Failures) As of commit f58c16f with merge base 3d02561.
BROKEN TRUNK - The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR re-enables the DA8W4 (dynamic int8 activation + int4 weight) CPU path on x86 by extending Int4OpaqueTensor to support DA8W4 packing/execution and adding a new quantization config and unit tests for the workflow.
Changes:
- Add DA8W4 weight quantization + prepack (`from_hp_da8w4`) and a DA8W4 `aten.linear` dispatch path in `Int4OpaqueTensor`.
- Introduce `Int8DynamicActInt4WeightOpaqueTensorConfig` and its module transform to apply DA8W4 quantization using `Int4OpaqueTensor`.
- Restore/expand the DA8W4 CPU unit tests (`test/quantization/test_da8w4_cpu.py`) and export the new config in the package `__init__`.
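For intuition, the "dynamic int8 activation" half of DA8W4 computes a fresh scale per activation row at inference time rather than calibrating it ahead of time. A minimal pure-Python sketch of symmetric per-row int8 quantization (illustrative only, not the torchao kernel; function names are hypothetical):

```python
# Sketch of dynamic symmetric int8 activation quantization, the "DA8" half
# of DA8W4. The real path packs int4 weights and calls da8w4_linear_cpu;
# this only illustrates the runtime (dynamic) scale computation.

def quantize_row_int8(row):
    """Quantize one activation row to int8 with a scale computed at runtime."""
    amax = max(abs(v) for v in row) or 1.0  # avoid divide-by-zero on all-zero rows
    scale = amax / 127.0                    # symmetric: zero-point is 0
    q = [max(-128, min(127, round(v / scale))) for v in row]
    return q, scale

def dequantize_row(q, scale):
    """Recover an approximation of the original row."""
    return [v * scale for v in q]

row = [0.6, -1.0, 0.25]
q, scale = quantize_row_int8(row)
deq = dequantize_row(q, scale)
```

Because the scale is derived from each incoming batch, no calibration data is needed, which is what distinguishes the dynamic-activation configs from static ones.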
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| torchao/prototype/int4_opaque_tensor/int4_opaque_tensor.py | Adds DA8W4 weight quantize+prepack and dynamic-activation linear implementation using da8w4_linear_cpu. |
| torchao/prototype/int4_opaque_tensor/inference_workflow.py | Adds a new DA8W4 config + quantize-module handler for Int4OpaqueTensor. |
| torchao/prototype/int4_opaque_tensor/__init__.py | Exposes the new DA8W4 config in the public prototype package API. |
| test/quantization/test_da8w4_cpu.py | Adds DA8W4 CPU tests validating compilation/codegen and basic accuracy. |
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
```python
# DA8W4 path: dynamic int8 activation + int4 weight
if weight_tensor.act_mapping_type is not None:
    if weight_tensor.act_mapping_type == MappingType.SYMMETRIC:
```
`act_mapping_type` is stored on `Int4OpaqueTensor` as a string (`"symmetric"`/`"asymmetric"`), but this dispatch compares it to `MappingType.SYMMETRIC`. That condition can never be true, so the symmetric-path gate is a no-op, and the code is inconsistent with `_da8w4_linear` (which checks the string). Make the representation consistent: either store a `MappingType` in the tensor attribute and compare against `MappingType.*`, or keep the string and compare against `"symmetric"`.
Suggested change:

```diff
-if weight_tensor.act_mapping_type == MappingType.SYMMETRIC:
+if weight_tensor.act_mapping_type == "symmetric":
```
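The mismatch is easy to reproduce: in Python, a plain `str` never compares equal to an `Enum` member, even when the member's value is that exact string. A stand-alone sketch (the `MappingType` here is a local stand-in for torchao's enum, and `normalize_act_mapping` is a hypothetical helper):

```python
from enum import Enum

class MappingType(Enum):  # stand-in for torchao.quantization.MappingType
    SYMMETRIC = "symmetric"
    ASYMMETRIC = "asymmetric"

# A str never equals an Enum member, even when the member's value matches,
# which is why the MappingType.SYMMETRIC gate above can never fire:
str_vs_member = ("symmetric" == MappingType.SYMMETRIC)        # always False
str_vs_value = ("symmetric" == MappingType.SYMMETRIC.value)   # True

def normalize_act_mapping(m):
    """Coerce either representation to MappingType so comparisons are safe."""
    return MappingType(m) if isinstance(m, str) else m
```

Normalizing once at construction time (or in `__init__` of the tensor subclass) keeps every downstream comparison consistent.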
```python
if config.set_inductor_config:
    torchao.quantization.utils.recommended_inductor_config_setter()

assert hasattr(module, "weight"), (
    "applying DA8W4 quant requires module to have weight attribute"
    + f" but {module} does not have one"
)
```
The DA8W4 module transform quantizes/prepacks weights unconditionally. If the DA8W4 CPU kernels aren't built/registered (or the running PyTorch is too old for the needed path), the transform will still replace `module.weight` with an `Int4OpaqueTensor`, and the first `linear()` call will then fail at runtime. Consider adding an early guard here (similar to the one in the unit test) that checks kernel availability via `torch._C._dispatch_dump("torchao::da8w4_linear_cpu")` and `torch_version_at_least("2.7.0")` (2.8.0 for the symmetric path) before applying the transform; otherwise log a warning and return the original module.
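One way to structure such a guard is to factor the decision out of the transform so the environment probes are injectable and unit-testable. In this sketch, `kernel_registered` and `version_at_least` are hypothetical stand-ins for the results of `torch._C._dispatch_dump("torchao::da8w4_linear_cpu")` and torchao's `torch_version_at_least` helper; the version thresholds mirror the ones named in the comment above:

```python
def da8w4_cpu_supported(kernel_registered, version_at_least, symmetric=False):
    """Decide whether the DA8W4 CPU transform should be applied.

    kernel_registered: bool, e.g. derived from probing the dispatcher for
        torchao::da8w4_linear_cpu (hypothetical wiring, not torchao code).
    version_at_least: callable(str) -> bool, e.g. torch_version_at_least.
    symmetric: the symmetric act-mapping path needs a newer PyTorch (2.8.0).
    """
    if not kernel_registered:
        return False  # kernels not built/registered: leave the module as-is
    required = "2.8.0" if symmetric else "2.7.0"
    return version_at_least(required)

# Stand-in version check simulating a PyTorch 2.7.0 environment.
def fake_version_at_least(v, current=(2, 7, 0)):
    return current >= tuple(int(p) for p in v.split("."))
```

With this shape, the transform can call `da8w4_cpu_supported(...)` up front and fall back to returning the original module (with a log message) instead of installing a weight that will fail on first use.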
Summary
This PR re-enables the DA8W4 path on X86 CPU with Int8DynamicActInt4WeightOpaqueTensorConfig and Int4OpaqueTensor, and updates the unit tests in test/quantization/test_da8w4_cpu.py.
Test plan
python test/quantization/test_da8w4_cpu.py